ABSTRACT

In order to understand the formation and subsequent evolution of galaxies one must 
first distinguish between the two main morphological classes of massive systems: spirals 
and early-type systems. This paper introduces a project, Galaxy Zoo, which provides 
visual morphological classifications for nearly one million galaxies, extracted from the 
Sloan Digital Sky Survey (SDSS). This achievement was made possible by inviting 
the general public to visually inspect and classify these galaxies via the internet. The 
project has obtained more than 4 x 10^7 individual classifications made by ~ 10^5 par-
ticipants. We discuss the motivation and strategy for this project, and detail how the
classifications were performed and processed. We find that Galaxy Zoo results are con- 
sistent with those for subsets of SDSS galaxies classified by professional astronomers, 
thus demonstrating that our data provides a robust morphological catalogue. Obtain- 
ing morphologies by direct visual inspection avoids introducing biases associated with 
proxies for morphology such as colour, concentration or structural parameters. In ad-
dition, this catalogue can be used to directly compare SDSS morphologies with older
data sets. The colour-magnitude diagrams for each morphological class are shown, 
and we illustrate how these distributions differ from those inferred using colour alone 
as a proxy for morphology. 

Key words: methods: data analysis, galaxies: general, galaxies: spiral, galaxies: 
elliptical and lenticular 



1 INTRODUCTION 

Dividing galaxies into categories based on their morphology, 
or shapes, has been standard practice since it was first sys- 
tematically applied by Hubble (1936). It is perhaps surpris- 
ing that sorting galaxies into categories which are suggested 
solely by their morphology produces classifications which 
broadly correlate with other, physical parameters such as the 
star formation rate or gas fraction. The fundamental distinc-
tion drawn is between galaxies with spiral arms and early-
type systems¹. For most of the twentieth century, catalogues
of classified galaxies were compiled by individuals or small
teams of astronomers (e.g. Sandage 1961; de Vaucouleurs
1991). With the advent of modern surveys (such as the Sloan
Digital Sky Survey or SDSS, see Section 1.1) containing
many hundreds of thousands of galaxies this approach was
no longer practical.



* This publication has been made possible by the participation
of more than 100,000 volunteers in the Galaxy Zoo project,
† Email: cjl@astro.ox.ac.uk
‡ Email: kevins@astro.ox.ac.uk

1 For the purposes of this paper we use the term 'elliptical' rather
than 'early-type' as this is the description used on the Galaxy Zoo 
site. However, the term should be understood as including both 
elliptical and lenticular systems. 

Anticipating the problem these surveys would cause,
Lahav et al. (1995) compared classifications from a set of
experts who considered a sample of just over 800 galax-
ies. Their motivation was to create a training set for neu-
ral networks, with the aim of automating the classifica-
tion process. While such methods have indeed been devel-
oped (Ball et al. 2004), modern studies (e.g. Bernardi et al.
2003; Lintott, Ferreras & Lahav 2006) still separate early-
type galaxies from spirals in large data sets by using
proxies for morphology rather than by directly determin- 
ing morphology itself. Typically, selection criteria based 
on galaxy properties such as colour, concentration in- 
dex, spectral features, surface brightness profile, struc- 
tural parameters or some combination of these are used 
(e.g. Abraham, van den Bergh & Nair 2003; Conselice 2006;
Kauffmann et al. 2004; Scarlata et al. 2007; Strateva et al.
2001). However, the use of each of these criteria results in an
unknown and potentially unquantifiable bias in the result- 
ing sample of galaxies. In other words, although morpholog- 
ical labels are often used for the resulting catalogues, each 
of these criteria produces a sample different from that ob- 
tained by true morphological selection. Comparing results 
from samples selected using different morphological proxies 
can therefore be misleading. 

Avoiding such confusion between categories is inher- 
ently desirable, but is of particular importance - to give just 
one example - for studies which seek to understand the influ- 
ence of star formation on the larger-scale process of galaxy 
formation. The colour of a galaxy is often used as a proxy for
morphology, but it is also a direct consequence of the
star-formation history of the galaxy being
studied. By directly classifying objects according to their
morphology, the catalogue is sorted according to their dy-
namics; spirals are rotating, whereas ellipticals are tri-axial
(Binney et al. 1982). Other information such as colours, the
presence or absence of emission lines, or the galaxy profiles 
can then be used to investigate the properties of the clas- 
sified galaxies, rather than being used in the classification 
itself. 

Several subsets of the SDSS have been classified by pro-
fessional astronomers. Fukugita et al. (2007) recently com-
piled a catalogue of early-type objects by the visual inspec-
tion of ~ 2500 galaxies in the SDSS by three expert classi-
fiers. This is an order of magnitude smaller than the sample
used by Schawinski et al. (2007) for their study of AGN feed-
back in early-type galaxies. Their sample (MOSES: Mor-
phologically Selected Ellipticals in SDSS) was obtained by
carrying out manual inspection of all objects in the SDSS 
DR4 spectroscopic sample with redshift 0.05 < z < 0.10 
and r-band magnitude r < 16.8. The resulting sample con- 
sists of 48,023 galaxies, or approximately 5% of the complete 
SDSS galaxy sample (see below). This sample was then in- 
spected to identify galaxies with an elliptical morphology. 
The importance of such a morphology-driven classification
can be seen from the comparison of MOSES ellipticals with
those selected by Bernardi et al. (2003). Of the ellipticals
selected by Bernardi et al., 5% show emission lines indica-
tive of star-forming activity compared to 18% of the MOSES
sample. The sample selected by morphology alone includes a 
set of star-forming galaxies that are excluded from samples 
selected by other methods. 

Despite the desirability of pure morphological classifi-
cation, the samples provided by SDSS and other modern
surveys are simply too large for astronomers to visually in- 
spect the entire catalogue. Furthermore, without multiple 
independent classifications of the same galaxy, it is diffi- 
cult to establish how much confidence can be placed in the 
classifier or classifiers. Ideally, large numbers of independent 
classifications would be made for each galaxy in the sample, 
allowing the errors to be quantified. 

In this paper we present the results of an attempt 
to solve this problem by inviting large numbers of peo- 
ple to classify galaxies over the internet. This solution - 
known as 'crowdsourcing' or 'citizen science' - had been 
successfully employed by projects such as Stardust@Home
(Westphal et al. 2006; Mendez 2008). This project was a
search for interstellar dust particles in the sample collected
by the Stardust spacecraft from Comet Wild-2, with the
initial selection of samples for further analysis being made
by visual inspection. Galaxy Zoo involves an order of mag-
nitude more participants than its predecessors, and is the
first attempt to apply these techniques to astrophysical 
problems. Visual inspection is also an excellent method for 
serendipitous discovery of the unusual in any data set, and 
the more unusual objects discovered by Galaxy Zoo classi- 
fiers will be discussed in a series of future papers. 



1.1 The Sloan Digital Sky Survey 

The galaxies for this project were drawn from the Sloan
Digital Sky Survey (York et al. 2000). The SDSS is a sur-
vey of a large part of the northern sky providing pho-
tometry in five filters, u, g, r, i and z (Fukugita et al. 1996),
covering approximately 26% of the entire sky. We use the
latest available data, contained in Data Release 6 (DR6;
Adelman-McCarthy et al. 2007). The SDSS spectroscopic
target selection algorithm (Strauss et al. 2002) produces the
Main Galaxy Sample, which includes all extended objects
with Petrosian magnitude r < 17.77 (Petrosian 1976). All
objects in this sample which the SDSS photometric pipeline
(Lupton et al. 2001) identified as a galaxy were included in the
Galaxy Zoo database, regardless of whether or not such spec- 
tra have been obtained to date. This list included a total of 
738,175 galaxies drawn from the SDSS main galaxy cata- 
logue. In addition, objects which were not in the spectro- 
scopic catalogue but which had already been observed and 
as a result classified as a galaxy by the SDSS spectroscopic 
pipeline were added to our list. This secondary selection 
comprised 155,037 objects drawn from both the main and 
luminous red galaxy SDSS catalogues. In all, 893,212 objects 
were included in our sample. It is reasonable to assume that 
the accuracy of classification of a galaxy will depend on fac-
tors including the apparent size of the system and its
surface brightness. However, as the biases were unquantified 
before the study was completed, no cuts on this inclusive 
sample were imposed. 

2 GALAXY ZOO

The data for this project was collected via a website². In
order to minimize the degree of knowledge needed by the 
volunteers, users of the site were not required to distinguish 
between elliptical and lenticular galaxies, or between dif- 
ferent classes of spirals (Sa, Sb etc). Visitors to the site 
were asked to read a brief tutorial giving examples of each 
class of galaxy, and then to correctly identify a set of 'stan- 
dard' galaxies. These standard systems were selected from 
the SDSS and classified by team members; those with a low 
degree of agreement were rejected. Those who correctly clas- 
sified more than 11 of the 15 standards were allowed to pro- 
ceed to the main part of the site. The bar to entry was kept 
deliberately low in order to attract as many classifiers to the 
site as possible. 

The front page of the site and the main classifi-
cation page are shown in Figure 1. SDSS images are
shown to volunteers using the ImgCutout web service
(Nieto-Santisteban, Szalay & Gray 2004) on the SDSS web-
site (Szalay et al. 2002). The service displays a JPEG cutout
image of an area of sky, centered on a galaxy randomly 
chosen from the sample database, with an image scale of 
0.024 R_p arcseconds per pixel, where R_p is the Petrosian ra-
dius for the galaxy. These images are colour composites of
the three middle filters available in SDSS (g, r and i). Details
of the conversion to colour images are given in Lupton et
al. (2004). Traditional morphological classifications have used
single-band images in order to avoid confusion between mor- 
phology and colour. That said, these colour images are par- 
ticularly suitable for visual classification. In particular, they 
possess the large dynamic range necessary for the identifica- 
tion of faint features, and have a unique mapping between 
physical and display colours. The effect of this choice on the 
data is discussed in Section 4.1.
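The image-scale rule quoted above can be written as a short sketch. The factor 0.024 and the Petrosian radius come from the text; the function names and the `width_pixels` parameter are hypothetical, since the size of the cutout in pixels is not stated here.

```python
def cutout_scale(petrosian_radius_arcsec, factor=0.024):
    """Image scale in arcsec per pixel for a galaxy of Petrosian radius R_p."""
    if petrosian_radius_arcsec <= 0:
        raise ValueError("Petrosian radius must be positive")
    return factor * petrosian_radius_arcsec


def cutout_field_of_view(petrosian_radius_arcsec, width_pixels):
    """Angular width in arcsec of a cutout that is width_pixels across.

    width_pixels is an illustrative parameter; the text does not give
    the pixel dimensions of the images served to volunteers.
    """
    return cutout_scale(petrosian_radius_arcsec) * width_pixels
```

Because the scale is proportional to R_p, every galaxy fills roughly the same fraction of its image regardless of its apparent size.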

In addition to sorting galaxies according to their 
morphology, the website asked classifiers to further di- 
vide galaxies they identified as spiral into three sub- 
categories according to the direction of their spiral arms 
(clockwise/anticlockwise/edge-on). This is a reliable in- 
dicator of the sense of the rotation of the galaxy 
(Pasha & Smirnov 1982). The motivation for this part of
the study is twofold. Firstly, we aim to investigate the evi- 
dence for a preferred handedness of spiral galaxies reported 
in the SDSS by Longo (2007). This result conflicts with an 
earlier paper by Sugai & Iye (1995), who did not find such a
result in a different (but comparably sized) dataset. Longo's
work was based on a sample of 2817 spirals selected by eye 
from galaxies in the SDSS with a redshift less than 0.04 
and a magnitude of ^ < 17. We have been able to extend 
his analysis to a sample which contains a factor of ten more 
galaxies, and the results are presented in a companion paper 
(Land et al., 2008). Secondly, it will also prove possible to 
use our results to calculate the two point correlation func- 
tion for rotating spirals, an interesting new constraint on 
models of galaxy formation (Slosar et al. 2008). 

Each object extracted from the SDSS database was 
thus classified as belonging to one of six categories: Spi-
ral (Clockwise rotation), Spiral (Anticlockwise rotation),
Spiral (Edge-on/rotation unclear), Elliptical, Merger, or

2 www.galaxyzoo.org 

Figure 1. Front page (top) and main analysis page (bottom)
from the Galaxy Zoo website. 



Star/Don't Know. The symbols used for this classification 
are shown in Table 1. In order to keep the task as simple
as possible, no further distinction was made between barred 
and unbarred spiral systems, for example. Once a classifica- 
tion is chosen, then the image of the next galaxy is auto- 
matically displayed. 

As the Galaxy Zoo website gathers data, these are 
stored into a live Structured Query Language (SQL) 
database. For each entry we store the timestamp, user iden- 
tification, galaxy identification and the classification chosen 
by the user. Classifications by unregistered visitors are dis-
carded and the user is requested to register and complete the
tutorial described above. For the analysis, this database may 
be downloaded and processed through the pipeline described 
below. 
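As an illustration of the record structure described above, the following is a minimal sketch using Python's built-in `sqlite3` module. The table name, column names and the integer class codes 1 to 6 are assumptions made for the example; the text specifies only which four quantities are stored.

```python
import sqlite3


def create_classification_db(path=":memory:"):
    """Open a database holding one row per classification event."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS classifications (
            id             INTEGER PRIMARY KEY AUTOINCREMENT,
            created_at     TEXT    NOT NULL,  -- timestamp of the click
            user_id        INTEGER NOT NULL,  -- registered user
            galaxy_id      INTEGER NOT NULL,  -- object identifier
            classification INTEGER NOT NULL   -- assumed codes 1..6
                CHECK (classification BETWEEN 1 AND 6)
        )
        """
    )
    return conn


def record_classification(conn, created_at, user_id, galaxy_id, classification):
    """Store a single classification; commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO classifications "
            "(created_at, user_id, galaxy_id, classification) "
            "VALUES (?, ?, ?, ?)",
            (created_at, user_id, galaxy_id, classification),
        )
```

Keeping every click as its own row, rather than a running tally per galaxy, is what allows the later cleaning and reweighting steps to be rerun from the raw data.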

Although some classifiers will inevitably know (or will 
learn from experience during the project) that spiral galaxies 
tend to be bluer than elliptical galaxies, the tutorial stressed 
that objects should be classified according to their morphol-
ogy alone. No mention was made of the colour-morphology
relation. In order to quantify the effect of colour on our re-
sults, a selection of monochrome images was introduced to
the sample, and the results are discussed in Section 4.1. The
data discussed in this paper represent the final results from 
the first stage of the project; Galaxy Zoo 2, which will ask 
for more detailed classifications, will follow shortly. 

Button      Description

[symbol]    Elliptical galaxy
[symbol]    Clockwise/Z-wise spiral galaxy
[symbol]    Anti-clockwise/S-wise spiral galaxy
[symbol]    Spiral galaxy other (e.g. edge-on)
[symbol]    Star or Don't Know (e.g. artefact)
[symbol]    Merger



Table 1. Galaxy Zoo classification categories showing schematic 
symbols as used on the site. 




Figure 2. Cumulative classifications collected by the Galaxy Zoo
site. The sudden increases visible at ~ 145 and ~ 160 days corre-
spond to email newsletters being sent out to those registered with
the site. These led to a sustained increase in the rate of classifi-
cation. Following day 140, data collected contributed to the bias
study described in Section 4.1.



3 PRODUCING A CATALOGUE 

The website was successful in attracting large numbers of 
classifiers and classifications, as shown in Figures 2 and 3.
Each galaxy in our sample was thus viewed and classified
multiple times, with a mean of ~ 38 classifications per galaxy.
A variety of strategies are available to convert from these 
raw classifications to a final catalogue. In this section we 
compare the results from several different strategies. 

The first step in data reduction involves removing obvi- 
ously bogus classifications. A small number of users seem to 
have recorded a number of these classifications, either using 
some sort of automated mechanism or due to some unknown 
problem with their browser. They are easy to discern by the 
fact that they have multiple classifications for a small num-
ber of galaxies. We find all users who have classified two or
more galaxies more than five times each. This is extremely
unlikely under a Poisson distribution and hence all data points
from such users are discarded. There are 36 such poten-
tially malicious users, amounting to less than 0.05% of the
total number of participants. Furthermore, in order to ac-
count for accidental double clicks, if a user has classified the
same galaxy more than once, we take into account only the
first classification from each user. This latter stage ensures
that no single user can unduly influence the classification
assigned to a single galaxy. The two steps of this cleaning
process together remove about 4% of our data set. 
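The two cleaning steps can be sketched as follows. This is an illustrative implementation rather than the pipeline actually used: the record layout (timestamp, user, galaxy, classification) follows the database description in Section 2, while the function and parameter names are hypothetical.

```python
from collections import Counter


def clean_classifications(records, repeat_threshold=5, min_galaxies=2):
    """Apply both cleaning steps to raw classification records.

    records: iterable of (timestamp, user_id, galaxy_id, classification).
    Returns the surviving records in time order.
    """
    records = sorted(records, key=lambda rec: rec[0])

    # Step 1: find users who classified at least `min_galaxies` galaxies
    # more than `repeat_threshold` times each, and drop all their data.
    pair_counts = Counter((user, galaxy) for _, user, galaxy, _ in records)
    repeated = Counter(
        user for (user, _galaxy), n in pair_counts.items() if n > repeat_threshold
    )
    suspect = {user for user, n in repeated.items() if n >= min_galaxies}

    # Step 2: keep only the first classification of each galaxy by each user.
    seen = set()
    cleaned = []
    for rec in records:
        _, user, galaxy, _ = rec
        if user in suspect or (user, galaxy) in seen:
            continue
        seen.add((user, galaxy))
        cleaned.append(rec)
    return cleaned
```

Sorting by timestamp first ensures that "first classification" means first in time, independent of the order in which rows are read from the database.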




[Figure 3; horizontal axis: Number of classifications, 1 to 10^5]

Figure 3. The distribution of classifications among users. A small
number have completed more than 100,000 classifications each,
while the peak of the distribution is ~ 30 classifications per user.



In the next step we create the so-called combined spirals
sample. In this sample we combine all three possible spiral 
classifications into a single classification. This is useful for 
studies that require just a simple split into elliptical and 
spiral samples. All the subsequent analysis is performed on 
both the separated spirals (SS) and combined spirals (CS) 
samples. 

We are then in a position to create the unweighted 
(UW) final sample. The simplest method involves giving all 
classifiers equal weight and simply calculating the distribu- 
tion of classification for each galaxy. This distribution can 
now be interpreted in a Bayesian manner: it represents our 
state of knowledge about that particular galaxy. 
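A minimal sketch of the unweighted vote distribution, for both the separated and combined spirals samples, is given below. The integer class codes (2, 3 and 4 for the three spiral categories) follow the class numbers used later in this section; the `"spiral"` label and the function names are assumptions made for the example.

```python
from collections import Counter, defaultdict

SPIRAL_CLASSES = {2, 3, 4}   # clockwise, anticlockwise, edge-on/other
COMBINED_SPIRAL = "spiral"   # hypothetical label for the merged class


def combine_spirals(classification):
    """Map the three spiral classes onto a single combined class."""
    return COMBINED_SPIRAL if classification in SPIRAL_CLASSES else classification


def vote_fractions(votes, combined=False):
    """Unweighted distribution of classifications for each galaxy.

    votes: iterable of (galaxy_id, classification), one per user and galaxy.
    Returns {galaxy_id: {class: fraction of votes}}.
    """
    counts = defaultdict(Counter)
    for galaxy, cls in votes:
        counts[galaxy][combine_spirals(cls) if combined else cls] += 1
    return {
        galaxy: {cls: n / sum(per_class.values()) for cls, n in per_class.items()}
        for galaxy, per_class in counts.items()
    }
```

For example, four votes (1, 2, 3, 3) for one galaxy give fractions 0.25, 0.25 and 0.5 in the separated sample, but 0.25 elliptical and 0.75 spiral in the combined sample.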



3.1 Weighted sampling 

The unweighted method discussed above does not discrimi- 
nate between results from those who think carefully before 
classifying each galaxy, and those who take less time. Neither 
does it distinguish between the ability of our classifiers. It 
may therefore make sense to attempt to identify particularly 
'good' users. The meaning of 'good' is naturally subjective, 
but one obvious strategy is to pay more attention to classifi- 
cations from users who tend to agree with the majority. For 
this analysis, each user of the website was initially assigned 
a unit weight, as in the unweighted sample described above. 
A preliminary classification could then be obtained as before
for each galaxy. The weighting assigned to individual users
could then be adjusted according to how well they agree with this
assessment. A new set of galaxy classifications could then be
prepared using the new weights, and the process repeated 
until the weights converge. 

In order to avoid the weightings assigned to users be- 
ing distorted by the fainter end of our galaxy sample, we 
used only galaxies with Petrosian radius R_p > 4.5 arcsec and
r < 17 for our weighting. This leaves 257,000 galaxies in-
volved in producing user weightings, although the resulting
weightings are applied to all galaxy classifications.

The algorithm used was as follows: let the weight of
a user, k, be w_k and set all initial weights to 1. We then
integrate the database to find h_i(j), the number of users who
classified galaxy i as being class j (elliptical, anticlockwise
spiral etc.). N_g(k) is the number of galaxies classified by
user k. The weights and h_i are then updated by using the
formulae:

$$ h_i(j) = \sum_{k \,=\, \text{users who voted } j \text{ for galaxy } i} w_k, \qquad (1) $$

and

$$ w_k = \frac{A}{N_g(k)} \sum_{i} h_i(j \text{ chosen by user } k \text{ for galaxy } i), \qquad (2) $$

where A is chosen so that the mean user weight is one.

This process can then be repeated until convergence. The
final product is the weighted sample of galaxy classifications
and a set of user weights.
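The iteration can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the catalogue: it takes h_i(j) to be the summed weight of users who chose class j for galaxy i, and the new weight of user k to be the average of h_i over that user's own votes, rescaled so that the mean weight is one. The function names, the convergence tolerance and the iteration cap are hypothetical.

```python
from collections import defaultdict


def weighted_votes(votes, weights):
    """Eq. (1): h[galaxy][cls] is the summed weight of users choosing cls."""
    h = defaultdict(lambda: defaultdict(float))
    for (user, galaxy), cls in votes.items():
        h[galaxy][cls] += weights[user]
    return h


def iterate_user_weights(votes, max_iter=200, tol=1e-10):
    """Iterate Eqs. (1) and (2) until the user weights stop changing.

    votes: dict mapping (user, galaxy) -> class, one vote per pair.
    Returns (weights, h) with weights[user] and h[galaxy][class].
    """
    if not votes:
        raise ValueError("no votes supplied")
    n_gal = defaultdict(int)            # N_g(k): galaxies classified by user k
    for user, _galaxy in votes:
        n_gal[user] += 1
    weights = {user: 1.0 for user in n_gal}   # all initial weights are 1

    for _ in range(max_iter):
        h = weighted_votes(votes, weights)
        # Eq. (2) before normalisation: mean h over the user's own votes.
        raw = dict.fromkeys(weights, 0.0)
        for (user, galaxy), cls in votes.items():
            raw[user] += h[galaxy][cls] / n_gal[user]
        # A is fixed by requiring the mean user weight to be one.
        mean_raw = sum(raw.values()) / len(raw)
        new_weights = {user: value / mean_raw for user, value in raw.items()}
        change = max(abs(new_weights[u] - weights[u]) for u in weights)
        weights = new_weights
        if change < tol:
            break

    h = weighted_votes(votes, weights)
    return weights, {galaxy: dict(per_class) for galaxy, per_class in h.items()}
```

In a toy case with two users who always agree and a third who dissents on most galaxies, the dissenter's weight settles below one and the other two rise above it, which is the majority-favouring behaviour the procedure is designed to produce.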

It should be noted, however, that the process of 
reweighting favours the majority opinion. A user that is 
most similar to other users will get upweighted and a user
that does not conform to the pattern will get downweighted.
However, the overall agreement between users does not nec-
essarily mean improvement as people can agree on a wrong
classification. These effects must be calibrated using com-
parison with standardized observations as described in Sec-
tion 4.

The distribution of user weights for both the separated
and combined spiral samples is shown in Figure 4. Both
distributions are slightly skewed toward the low-weighted
end, and the combined spiral distribution is tighter than
that for the separated spiral data. This reflects the fact that,
when there are fewer possibilities to choose from in classifying
a galaxy, a set number of classifications yields a better signal-to-
noise ratio, allowing us to better constrain the user
weights.

There are thus four possible combinations of Separated
Spirals/Combined Spirals and Weighted/Unweighted sam-
ples. Unless otherwise stated, we use the weighted sample.
For each sample we distill the data further into clean and
superclean samples. A galaxy is in the clean or superclean
sample if it has more than ten votes (in practice, this applies
to almost our entire sample) and if 80% or 95% of users (or
user weights in the case of the weighted sample) respectively
agree on its type. These are extremely strong limits; an 80%
agreement is the equivalent of a 5-σ detection for a galaxy
with ten classifications, or a 10-σ detection for a galaxy with
the mean number of classifications.

Examples of objects in each category randomly ex-
tracted from the weighted superclean sample are shown in
Figure 5, and examples from the clean sample in Figure 6.
It should be noted that the combined spiral sets cannot be
recovered simply by taking the separated spirals clean set and
combining classes 2, 3 and 4. For example, a galaxy that 
has all its votes evenly split between classes 2 and 3 (clock- 
wise and anticlockwise spirals) will definitely be included in 
the combined spiral clean set, but would not appear in the 
separated spirals clean set. 
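The clean and superclean criteria, and the reason the combined set cannot be rebuilt from the separated one, can be illustrated with a short sketch. The thresholds and the vote minimum come from the text; treating the 80% and 95% limits as inclusive is an assumption, as are the function name and the `"spiral"` label.

```python
def sample_flags(fractions, n_votes, min_votes=10, clean=0.80, superclean=0.95):
    """Return (leading class, in clean sample, in superclean sample).

    fractions: {class: fraction of votes or of summed user weight}.
    n_votes:   number of classifications received by the galaxy.
    The limits are applied as >=, which is an assumption of this sketch.
    """
    leading, share = max(fractions.items(), key=lambda item: item[1])
    enough = n_votes > min_votes          # "more than ten votes"
    return leading, enough and share >= clean, enough and share >= superclean


# A galaxy whose votes are split evenly between classes 2 and 3:
separated = {2: 0.5, 3: 0.5}
combined = {"spiral": 1.0}
print(sample_flags(separated, 40)[1:])   # not clean in the separated sample
print(sample_flags(combined, 40)[1:])    # clean and superclean when combined
```

The flags are evaluated on each sample's own vote distribution, so a galaxy can enter the combined spirals clean set without appearing under any single spiral class in the separated set.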

The effect of this weighting process is shown in Table
2 for the separated spirals and in Table 3 for the combined

Figure 4. The distribution of user weights for the separated (solid
line) and combined (dashed line) data sets. The distribution for
the separated spirals is slightly wider than that for the combined
spirals sample.



spirals. These tables show that in all cases the vast majority 
(above 99% in all but two cases) of classifications made in the 
unweighted samples are carried forward into the weighted 
sample. This means that the weighting described above is 
not changing classifications. However, extra galaxies are in- 
cluded in each classification in the weighted sample. This 
effect is largest (~ 15%) for the elliptical classes; this can 
easily be explained by the fact that it is more difficult to 
agree on the presence of spiral structure than on an ellipti- 
cal morphology. A smaller proportion of spiral systems reach 
the stringent criteria for inclusion in the superclean sample. 
In total, more than 300,000 galaxies are included in the most 
inclusive sample; this is the largest sample of morphologi-
cally classified galaxies by a factor of 10.

Inspection of Tables 2 and 3 immediately reveals that
many more galaxies in the clean sample have been classi-
fied as elliptical than spiral. The elliptical:spiral ratio is ~ 3
for both the weighted and unweighted clean sample. The
combined spiral clean sample produces a much lower ratio
(~ 2). This difference is another illustration of the discrim-
ination against spirals discussed in the previous paragraph.
The combined spirals data should be free of such effects,
but still has a large elliptical fraction. This reflects the ten-
dency of our users to classify objects which are faint, small
in angular extent on the sky or both as elliptical if no spi- 
ral features are present. It is therefore important to apply 
magnitude cuts to the data before using data for the popu- 
lation as a whole; individual users of the Galaxy Zoo data 
will require different cuts and so we do not impose any on 
the clean sample ourselves. The 'true' elliptical fraction for a 
volume-limited sample of well-classified galaxies is discussed 
in Section O 
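The ratios quoted in this paragraph can be checked directly from the weighted clean counts in Tables 2 and 3. The following sketch simply repeats that arithmetic; the dictionary keys and function name are illustrative.

```python
# Weighted clean-sample counts taken from Tables 2 and 3.
separated_clean = {
    "elliptical": 219326,
    "cw_spiral": 17571,
    "acw_spiral": 18946,
    "other_spiral": 27310,
}
combined_clean = {"elliptical": 208437, "spiral": 101855}


def elliptical_to_spiral_ratio(counts):
    """Number of ellipticals divided by the total over all spiral classes."""
    spirals = sum(n for name, n in counts.items() if "spiral" in name)
    return counts["elliptical"] / spirals


print(round(elliptical_to_spiral_ratio(separated_clean), 1))  # about 3.4
print(round(elliptical_to_spiral_ratio(combined_clean), 1))   # about 2.0
```

These values correspond to the ratios of ~ 3 and ~ 2 given above for the separated and combined clean samples.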

sample      class            # in weighted  # in unweighted  % increase  % common

clean       elliptical              219326           184743        15.8     99.98
clean       CW spiral                17571            17100         2.7     99.70
clean       ACW spiral               18946            18471         2.5     99.72
clean       other spiral             27310            26037         4.7     99.46
clean       star/don't know           8134             8074         0.7     99.75
clean       merger                    1062              961         9.5     99.48
superclean  elliptical               26200            19121        37.0     99.7
superclean  CW spiral                 6532             6106         7.0     99.4
superclean  ACW spiral                7486             7034         6.4     99.4
superclean  other spiral              4760             4247        12.1     99.2
superclean  star/don't know           5589             5393         3.6     99.5
superclean  merger                      70               62        12.9     96.8

Table 2. Comparison of classification between weighted and unweighted samples. For each class the Table shows the number of galaxies 
so classified, the percentage increase in weighted over unweighted classifications and the percentage of the unweighted sample in common 
with the weighted sample. 



Combined Spirals

sample      class            # in weighted  # in unweighted  % increase  % common

clean       elliptical              208437           184743        12.8     99.9
clean       spiral                  101855            97848         4.1     99.9
clean       star/don't know           8126             8074         0.6     99.9
clean       merger                    1056              961         9.9     99.5
superclean  elliptical               23806            19121        24.5     99.7
superclean  spiral                   34673            32559         6.5     99.7
superclean  star/don't know           5573             5393         3.3     99.7
superclean  merger                      67               62         8.1     96.8

Table 3. As Table 2 but for the combined spirals data set.



4 COMPARISON WITH OTHER SAMPLES

In order to assess the reliability of the Galaxy Zoo classifica- 
tions, we compare our sample with that produced by previ- 
ous projects. The MOSES sample (Schawinski et al., 2007) 
described in Section 1 consists of 15729 galaxies classified 
as elliptical selected from an initial set of 48023 galaxies. Of 
the 48023 the clean sample includes classifications for 19649 
systems. The results for the weighted clean sample are given 
in Table 4.

More than 99.9% of the galaxies classified as MOSES
ellipticals which are in the Galaxy Zoo clean sample are
found to be ellipticals by Galaxy Zoo. However, ~ 15% of
the ellipticals included in both the Galaxy Zoo clean sample
and MOSES were not classified as elliptical by MOSES. All
MOSES ellipticals in the superclean sample are classified by
Galaxy Zoo as ellipticals, but the sample contains 3% more
ellipticals than MOSES. These extra ellipticals are the result
of the different motivation of the studies; MOSES was an at-
tempt to produce a very clean set of ellipticals, whereas the
Galaxy Zoo samples include more of the S0-Sa continuum in
the resulting sample. The Galaxy Zoo instructions to volun-
teers did not mention disks at all, and so galaxies which are
elliptical in morphology but have visible disks would have
been included in Galaxy Zoo but not in MOSES.

                   MOSES E   MOSES other   MOSES all

Elliptical          10,858         1,676      12,534
ACW spiral               0         2,493       2,493
CW spiral                2         2,598       2,600
Other spiral             4         1,940       1,944
Star/don't know          0             4           4
Merger                   4            70          74

Total               10,868         8,781      19,649

Table 4. Comparison of classifications for galaxies in both
MOSES and the Galaxy Zoo weighted clean sample. Most
MOSES ellipticals are classified by Galaxy Zoo as elliptical.

Figure 5. Examples of galaxies in each class drawn from the
weighted superclean sample. Each image is 51.2 x 51.2 arcsec.

Figure 6. Examples of galaxies in each class drawn from the
weighted clean sample. Each image is 51.2 x 51.2 arcsec.

Figure 7. Weighted vote in class 1, corresponding to ellipticals,
for the 15729 galaxies classified in the MOSES sample as elliptical.
Those with a weight in this class greater than 80% are included in
the Galaxy Zoo clean sample, but it is clear from this figure that
this is an effectively arbitrary cut-off point. For approximately 90%
of the galaxies, the majority of weighted votes are in the elliptical
category.

There are also ellipticals which are in the MOSES cat- 
alogue but not in the clean sample. The distribution of 
weighted votes for MOSES ellipticals including both those 
included in the clean sample and those which are not is 
shown in Figure 7. The majority of weighted votes in almost
all cases support an elliptical classification. The requirement 
for the clean classification of a weighted vote of 80% thus 
lies in the middle of a continuous distribution of weights. In 
most cases, the remaining votes show that a small minority 
of users selected other options, usually for good reasons such 
as the presence of a nearby satellite trail or some evidence 
of a disturbed morphology. The weighted vote in the spiral 
categories is below 20% in all but an insignificant number 
of cases. This example thus illustrates the stringency of the 
clean and superclean samples; only galaxies on which a large 
majority of users agree are included in the final samples. 

In order to investigate this effect and provide an inde- 
pendent check on the data, we consider another set of SDSS 
galaxies, those classified by Fukugita et al. (2007). They use 
the statistic 'T' as their classification, taken from an aver-
age of 3 classifiers (rounding to the nearest half integer).
The options available are:
0 (E), 1 (S0), 2 (Sa), 3 (Sb), 4 (Sc), 5 (Sd), and 6 (Im). Unlike
MOSES, therefore, we can use this smaller sample to probe
the response of Galaxy Zoo users to galactic morphologies of
a wide variety of sub-types. 1 ≤ T ≤ 5 are 'spiral' systems,
T = 0, 0.5 are 'elliptical' and T < 0 and T = 6 are unclassi-
fied or irregular systems. Of their sample of 2275 galaxies we
have clean classifications for 1300 galaxies (621 are included
in the superclean sample). The mean T for galaxies included
in the clean sample and classified as elliptical is 0.52, and
that for spirals 2.54. A full comparison for the clean sample
is given in Table 5 and the distribution of weights shown in
Figure 8.
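The grouping of T values used for this comparison can be written as a small lookup. This is a sketch under the assumption that the spiral range is inclusive at both ends; the function name and the returned labels are illustrative, and half-integer values not mentioned in the text (such as T = 5.5) are deliberately left unassigned.

```python
def broad_class_from_t(t):
    """Map a Fukugita et al. T value onto the broad classes used here.

    Returns None for values the text does not assign to any group.
    """
    if t < 0 or t == 6:
        return "unclassified/irregular"
    if t in (0, 0.5):
        return "elliptical"
    if 1 <= t <= 5:          # assumed inclusive at both ends
        return "spiral"
    return None


for t in (-1, 0, 0.5, 1, 2.5, 5, 5.5, 6):
    print(t, broad_class_from_t(t))
```

Under this grouping an S0 galaxy (T = 1) falls in the 'spiral' range, which is worth bearing in mind when reading the comparison of early-type fractions.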

The vast majority of galaxies classified as elliptical in 
the clean sample are classified as elliptical (T=0,0.5) by 

Figure 8. Histograms showing a comparison between Galaxy
Zoo classifications and those from Fukugita et al. The axis plots
the fraction of the weighted vote in the clean sample that Galaxy
Zoo allocated to elliptical (top) and spiral (bottom) for those
galaxies classified by Fukugita et al. as elliptical (El), ellipti-
cal/S0 (El/S0), S0, S0/Spiral (S0/Sp) and Spiral (Sp).



Fukugita et al. Of the ellipticals in the clean sample 92% cor-
respond to early-type galaxies in the Fukugita et al. sample
(S0 or ellipticals). The equivalent figure for the superclean
sample is 99%. All but two of the remaining galaxies classi-
fied by Galaxy Zoo as ellipticals were classified by Fukugita
et al. as Sa. This supports the hypothesis that the excess 
of ellipticals seen when comparing to the MOSES sample is 
composed mostly of Sa galaxies; astronomers are more re- 
luctant than the general public to classify something with 
a definite disk as an elliptical galaxy. No mention of disks 
was made in the instructions to our classifiers, but such an 
addition is an obvious change to make in future versions 
of the website. For the ellipticals, we find no obvious trend 
between T and magnitude, but T appears to be correlated 
with the weight of the classification. Switching to the super- 
clean sample therefore improves the correlation between the 
samples. 

Finally, Longo (2007) selected the spiral galaxies used
in his study by visual inspection of galaxies in the SDSS.
Of the 2834 galaxies in this sample, 2498 are included in
the Galaxy Zoo clean sample, 2491 of which are classified as
spirals. The other seven are classified as mergers (4) or as
'star/don't know' (3). A comparison with the clean separate
spiral catalogue finds excellent agreement for the winding
sense of the spiral arms (99.6%). In the 10 cases where there
was a disagreement, further inspection reveals that the dis- 
agreement can in each case be put down to human error in 
the catalogue of Longo (2007), illustrating the advantage of 
obtaining multiple independent classifications for each sys- 
tem. 

The three data sets with which we have compared 
Galaxy Zoo were compiled in very different ways to test dif- 
ferent hypotheses. However, in each case we find a remark- 
able degree of agreement (better than 90% in most cases) 
between our data set and those compiled by professional as- 
tronomers. We can therefore conclude that using data from 
volunteers will not substantially degrade the quality of the 
resulting catalogue while expanding the number of classified 
galaxies by a large factor. 



4.1 Measuring bias 

The aim of the Galaxy Zoo study was to produce a cata- 
logue of morphologically selected galaxies, independent of 
the bias introduced by using proxies for morphology. Poten- 

Table 5. Comparison of the combined spirals clean (top) and superclean (bottom) sample results with those from Fukugita et al. (2007).
Their classification is given on the x-axis, and the Galaxy Zoo results on the y-axis. See the table of Galaxy Zoo categories for details of our classification system.



Class        Original        Monochrome      Mirrored
             <%> (σ)         <%> (σ)         <%> (σ)

1            53.82 (0.12)    55.96 (0.12)    55.02 (0.12)
2 & 3 & 4    32.37 (0.13)    28.97 (0.13)    30.05 (0.13)
5            10.12 (0.06)    11.06 (0.06)    11.26 (0.06)
6             3.69 (0.05)     4.01 (0.05)     3.67 (0.05)



Table 6. Results of the bias study. The numbers given are the 
average percentage of votes that each class receives per galaxy, 
with 1 sigma errors obtained from jackknife resampling (see Land 
et al. for details). 



tially the strongest of these biases is the correlation between 
morphology and colour. While the instructions to users did 
not include any mention of colour, it is a fact that most 
spirals are significantly bluer than most elliptical galaxies, 
and this fact will be quickly learnt by classifiers. It is there- 
fore possible that our selection will include a residual colour 
bias. In order to quantify the size of any such effect, a pro- 
gramme of bias testing was undertaken. Users were shown 
either a mirror image of the original data, or a monochrome 
image produced from the coloured images. (These images 
are not single filter images, but rather a black and white 
version of the colour image produced by the SDSS pipeline 
as described above.) The results are given in Table 6.

Any bias study such as this runs the risk of itself changing
the behaviour of those taking part, a phenomenon known in
social science as the Hawthorne effect (Mayo 1933;
Adair, Sharpe & Huynh 1989). To give just one example of
how this might affect Galaxy Zoo, users may be more cau- 
tious with their classifications if they think that they are 
being tested for bias rather than just being asked to make 
their best guess. 

A change in user behaviour between the original clas- 
sifications and those collected as part of this bias study is 
indeed seen. In particular, users are more careful in their
classifications during the bias study. This effect makes it
impossible to make a fair comparison between classifications
made before the bias study started and those collected during
it. However, we do not expect mirroring the images to
influence the choice between spiral and elliptical galaxies, and
we can thus use the mirrored images as a control. The result
of a comparison between classifications of monochrome and
mirrored images is a significant (of order 5σ) difference in
behaviour. Users shown monochrome images are more likely
to classify a galaxy as an elliptical, and correspondingly less
likely to classify a galaxy as a spiral. There is also a bias
in favour of classifying a galaxy as a merger; this is
presumably due to the loss of colour information which enables
us to distinguish two separate galaxies from one merging
system. However, although these are statistically significant 
differences, they are small. The mean percentage of votes for 
the elliptical class increases from 55% to 56%, for example. 
We are thus justified in ignoring this bias when using the 
catalogues for most purposes. 
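
The size of the monochrome-versus-mirrored difference can be checked directly from the Table 6 entries. The sketch below assumes that the two jackknife errors are independent and can be added in quadrature (an assumption, not a statement of the method used in the bias study), and it infers the class labels from the surrounding text (class 1 elliptical, classes 2 to 4 spiral, class 6 merger).

```python
import math

# (mean vote percentage, 1-sigma error) copied from Table 6.
TABLE6 = {
    "elliptical (class 1)": {"mono": (55.96, 0.12), "mirror": (55.02, 0.12)},
    "spiral (classes 2-4)": {"mono": (28.97, 0.13), "mirror": (30.05, 0.13)},
    "merger (class 6)": {"mono": (4.01, 0.05), "mirror": (3.67, 0.05)},
}


def significance(a, b):
    """Difference a - b in units of the combined error (errors assumed independent)."""
    (mean_a, err_a), (mean_b, err_b) = a, b
    return (mean_a - mean_b) / math.sqrt(err_a ** 2 + err_b ** 2)


results = {
    name: significance(row["mono"], row["mirror"]) for name, row in TABLE6.items()
}
for name, n_sigma in results.items():
    print(f"{name}: monochrome - mirrored = {n_sigma:+.1f} sigma")
```

Under these assumptions the elliptical and spiral classes differ by between 5 and 6σ in magnitude, with opposite signs, while the shift in the mean vote itself is only about one percentage point.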

By using the monochrome images as a control, we can 
test for a bias in the classification of the direction of spiral 
arms. A significant bias in favour of anticlockwise classifica- 
tions was found, and is discussed in Land et al. (2008). We 
also expect a bias toward elliptical galaxies for more dis- 
tant systems as it becomes harder to resolve features which 
would indicate a spiral system. Provided that a conservative
cut in magnitude, size or redshift (or some combination of the
three) is made, this bias can be safely ignored. When
considering the properties of the population as a whole, it 
is possible to be more quantitative in accounting for the ef- 
fect of this bias on the results; for a full discussion of this 
technique, see Bamford et al. 2008. 



5 COLOUR-MAGNITUDE DIAGRAMS 

In Figures 9 and 10 we show the colour-magnitude diagrams
for those galaxies in our superclean sample which have
spectroscopic magnitudes. The magnitudes and colours are based
on absolute magnitudes calculated using kcorrect v4_1_4
(Blanton & Roweis 2007). The elliptical galaxies in the sample
have a mean u-r of 2.55, significantly redder than the
spirals (mean u-r = 1.85). These results correspond to the
classic 'red sequence' of early-type galaxies found by previous
studies (e.g. Sandage & Visvanathan (1978); Bower et al.
(1998)), with the blue galaxies existing not on a tightly
defined sequence but rather in a 'blue cloud'. The division
between the two is not straightforward, however. For example,
close inspection of Figure 10 reveals that the sample

Figure 9. Colour-magnitude diagram for galaxies in the weighted
superclean combined spirals sample. Systems classified as spiral
are shown in black, those classified as elliptical in red.




Figure 10. Colour-magnitude histograms for galaxies in the 
clean combined spirals sample. Crosses mark ellipticals, diamonds 
spirals. A two-Gaussian fit to the complete data is shown (top), 
together with the individual Gaussians used in the fit. The curve 
shown is the limit for the main galaxy catalogue; objects below 
this line were drawn from the LRG sample. 



contains populations of both blue elliptical galaxies (which
are discussed in a companion paper, Schawinski et al.
2008) and red spirals (the morphology-density relation for
which is shown in Bamford et al. 2008 and which will be
discussed in a future paper).

In particular, Figure 10 includes a fit to the data with
two Gaussians. The rest-frame colours used in these plots
are calculated using k-corrections from Blanton & Roweis
(2007). The combined result is reasonable, but as expected
from the discussion above the two Gaussians do not clearly
divide spiral from elliptical galaxies. In particular, the 'blue
elliptical' population forms a substantial contribution to the
blue side of the redder of the two Gaussians. This result
illustrates the importance of true morphological classification;
even a sophisticated division between 'red' and 'blue'
systems will not entirely separate the two morphological types.
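
To make the idea of a two-Gaussian decomposition of a colour distribution concrete, the sketch below fits a two-component mixture to synthetic u-r colours. Several things here are assumptions rather than the procedure behind Figure 10: the fitting method (expectation-maximisation, chosen because it needs only the standard library), the synthetic sample, and the component widths of 0.30 and 0.15. Only the two means (1.85 and 2.55) are taken from the text.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Gaussian probability density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, n_iter=150):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximisation.

    Returns two (weight, mean, sigma) tuples, sorted by mean.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    # Crude starting guess: components at the quarter points of the range.
    params = [(0.5, lo + 0.25 * span, 0.25 * span),
              (0.5, lo + 0.75 * span, 0.25 * span)]
    n = len(values)
    for _ in range(n_iter):
        # E-step: probability that each value belongs to component 0.
        resp = []
        for x in values:
            p0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate weight, mean and width of each component.
        updated = []
        for k in (0, 1):
            w = resp if k == 0 else [1.0 - r for r in resp]
            total = sum(w)
            mu = sum(wi * x for wi, x in zip(w, values)) / total
            var = sum(wi * (x - mu) ** 2 for wi, x in zip(w, values)) / total
            updated.append((total / n, mu, math.sqrt(var)))
        params = updated
    return sorted(params, key=lambda p: p[1])


# Synthetic colours: a broad blue component and a narrow red one (assumed widths).
random.seed(1)
colours = ([random.gauss(1.85, 0.30) for _ in range(1500)]
           + [random.gauss(2.55, 0.15) for _ in range(1500)])
blue_fit, red_fit = fit_two_gaussians(colours)
print("blue component (weight, mean, sigma):", blue_fit)
print("red component  (weight, mean, sigma):", red_fit)
```

A fit of this kind assigns each galaxy a probability of being 'red' or 'blue' from its colour alone, which is exactly why it cannot recover populations such as the blue ellipticals discussed above.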

In order to explore further the properties of our sam- 
ple in colour-magnitude space, we construct three volume- 
limited subsamples from those objects in the clean sample 



Figure 11. Cuts applied to create volume-limited subsamples
from the clean sample, shown as a function of redshift. The
curve shows the r=17.77 line converted to Mr using the distance
modulus but neglecting k-corrections, and corresponds to the
main galaxy sample limit for an object with a flat spectrum.
Points below (and some just above) this line are drawn from the
LRG sample.



for which spectroscopic redshifts have been obtained. In order
to improve confidence in the data, samples were constructed
both for r < 17.77 (solid lines) and r < 17.0 (dashed
lines). The cuts applied are illustrated in Figure 11. The
most luminous sample is dominated by elliptical galaxies,
with an elliptical:spiral ratio of 1.99. The intermediate
sample has a ratio of 0.98, and is thus evenly split between the
two classes, whereas the sample including the faintest
galaxies is dominated by spirals, with a ratio of 0.57.
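
The conversion of an apparent-magnitude limit into an absolute-magnitude limit via the distance modulus, as used for the curve in Figure 11, can be sketched as follows. The cosmology (flat, H0 = 70 km/s/Mpc, Omega_m = 0.3), the function names and the example redshift cut are all assumptions made for illustration; the cuts actually adopted are those shown in Figure 11. As in the figure, k-corrections are neglected.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Luminosity distance in Mpc for a flat Lambda-CDM cosmology (assumed parameters)."""
    omega_l = 1.0 - omega_m

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    # Trapezoidal integration of dz / E(z) from 0 to z.
    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        integral += inv_e(i * dz)
    integral *= dz
    return (1.0 + z) * (C_KM_S / h0) * integral


def absolute_magnitude_limit(z, r_limit=17.77):
    """Absolute magnitude of an object at the apparent limit, neglecting k-corrections."""
    d_l = luminosity_distance_mpc(z)
    return r_limit - 5.0 * math.log10(d_l) - 25.0


def in_volume_limited_sample(z, abs_mag, z_max, r_limit=17.77):
    """True if a galaxy lies within z_max and is brighter than the limit at z_max."""
    return z <= z_max and abs_mag <= absolute_magnitude_limit(z_max, r_limit)


limit_at_z01 = absolute_magnitude_limit(0.1)
print(f"M_r limit at z = 0.1 for r < 17.77: {limit_at_z01:.2f}")
```

A galaxy is kept in a volume-limited subsample only if it would still satisfy the apparent-magnitude limit at the far edge of the redshift range, which is what `in_volume_limited_sample` tests.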

Colour-magnitude diagrams for each of these subsamples
are shown in Figure 12. We also show Gaussian fits to
the data based on those in Baldry et al. (2004). Baldry et
al. divide galaxies drawn from the SDSS into red and blue
systems, defining a galaxy as red if Cur > C'ur, where Cur is
the rest-frame (k-corrected) u-r colour and



C'_{ur} = 2.06 - 0.244 \tanh\left( \frac{M_r + 20.07}{1.09} \right)    (3)



Fits to the data shown in Figure 12 are Gaussians with
the same mean and variance as those derived in Baldry et
al. These Gaussians were then normalized to fit our data set.
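
The magnitude-dependent colour limit of equation (3) is simple to evaluate. In the sketch below the function names are illustrative, and a galaxy is taken to be 'red' when its u-r colour exceeds the limit, following the Figure 12 caption (systems to the right of the line are red).

```python
import math


def colour_divider(abs_mag_r):
    """Colour limit C'_ur of equation (3) as a function of absolute magnitude M_r."""
    return 2.06 - 0.244 * math.tanh((abs_mag_r + 20.07) / 1.09)


def is_red(u_minus_r, abs_mag_r):
    """True if the rest-frame u-r colour is redder than the limit at this M_r."""
    return u_minus_r > colour_divider(abs_mag_r)


for mag in (-23.0, -21.0, -20.07, -19.0):
    print(f"M_r = {mag:6.2f}: divider at u-r = {colour_divider(mag):.3f}")
```

The limit moves redward for more luminous galaxies, ranging between 2.06 - 0.244 and 2.06 + 0.244, so a single fixed colour cannot reproduce it across the three subsamples.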

We show the results in Figure 12, where some general
trends are immediately apparent. The proportion of galaxies
classified as ellipticals is larger in the sample which includes
only the most luminous galaxies. The results also confirm, as
before, that in none of the three cases can the two distributions
(red and blue galaxies) which would be derived in the
absence of morphological information be simply interpreted
as corresponding to 'early' and 'late' type galaxies. It
is not possible to define a single colour with which to divide 
the two classes of galaxy; rather the distributions overlap to 
a large extent. 

The biggest difference between the populations inferred
from Gaussian fitting and those obtained by our morphological
classification is the presence of a substantial number of
red galaxies which were classified as spiral systems in the
lower luminosity samples. In fact, the population of galaxies
which we classify as morphologically spiral contains a
substantial number of systems with u-r colours greater than
~2.2. Many of these systems may actually be lenticulars;
however, distinguishing a well-resolved edge-on S0
galaxy from an edge-on spiral is impossible by visual
classification alone. Despite this contamination, however, true
red spirals do exist in the data and morphological and colour 
bimodality are - at least for this intriguing population - de- 
coupled. There is a corresponding population of blue ellip- 
tical galaxies, the most extreme examples of which are the 
subject of a companion paper (Schawinski et al. 2008), but 
they are less significant here. 



6 CONCLUSION 

We have described Galaxy Zoo, a web-based project which 
invited the public to classify galaxies imaged as part of the 
Sloan Digital Sky Survey. By combining the classifications 
of more than 100,000 participants in the largest astronom- 
ical collaboration in history, we are able to produce cata- 
logues of galaxy morphology which agree with those com- 
piled by professional astronomers to an accuracy of better 
than 10%. Our results thus suggest that the general public
can reliably classify large sets of galaxies with an accuracy
similar to that of professional astronomers. The largest of the
Galaxy Zoo catalogues includes more than 300,000 galaxies
reliably classified at more than 5σ confidence according
to morphology, a factor of ten larger than previous work.
Due to the repeated, independent classifications of the same 
object it is possible to quantify the errors in the classifica- 
tion, and produce catalogues of differing fidelity for different 
purposes (such as the clean and superclean catalogues
discussed here). By examining a volume-limited subset of the
data in colour-magnitude space we illustrate the differences
between the colour and morphological bimodalities in the 
data. The presence of a substantial number of red galaxies 
classified as spiral in particular underlines the importance of 
morphological classification; our results show that a tradi- 
tional morphological classification cannot be reproduced by 
cuts on colour alone. 



Figure 12. Colour-magnitude diagrams for our set of three
volume-limited samples. Petrosian magnitudes are used, and
k-corrections and absolute magnitudes are derived from
spectroscopic redshifts. As in Figure 10, crosses represent
ellipticals, diamonds spirals and the histogram the combined data
set. Gaussian fits were made to the combined data (spirals and
ellipticals) but only the individual Gaussians are shown. The
vertical dotted lines are the limits proposed by Baldry et al.
(2004) for dividing red and blue galaxies; systems to the right
of this line are defined as red. As defined in equation (3), this
limit depends on the absolute magnitude and we thus show the
limit for both minimum and maximum Mr in each range.